We show how dynamical decoupling (DD) and quantum error correction (QEC) can be optimally combined
in the setting of fault-tolerant quantum computing. To this end we identify the optimal generator set of DD
sequences designed to protect quantum information encoded into stabilizer subspace or subsystem codes. This
generator set, comprising the stabilizers and logical operators of the code, minimizes a natural cost function
associated with the length of DD sequences. We prove that with the optimal generator set the restrictive
local-bath assumption used in earlier work on hybrid DD-QEC schemes can be significantly relaxed, thus bringing
hybrid DD-QEC schemes, and their potentially considerable advantages, closer to realization.



Introduction. — The nemesis of quantum information processing is decoherence, the outcome of the
inevitable interaction of a quantum system with its environment, or bath. Several methods exist that
are capable of mitigating this undesired effect. Of particular interest to us here are quantum error
correction (QEC) [1-7] and dynamical decoupling (DD) [8, 9]. QEC is a closed-loop control scheme which
encodes information and flushes entropy from the system via a continual supply of fresh ancilla qubits,
which carry off error syndromes. DD is an open-loop control scheme that reduces the rate of entropy
growth by means of pulses applied to the system, which stroboscopically decouple it from the
environment. QEC and DD have complementary strengths and weaknesses. QEC is relatively resource-heavy,
but can be extended into a fully fault-tolerant scheme, complete with an accuracy threshold theorem
[10-15]. DD demands significantly more modest resources and can theoretically achieve arbitrarily high
decoherence suppression [16-23], but cannot by itself be made fully fault-tolerant [24].

A natural question is whether a hybrid QEC-DD scheme is advantageous relative to using each method
separately in the setting of fault-tolerant quantum computing (FTQC). Typically, improvements in gate
accuracy achieved by DD mean that more noise can be tolerated by a hybrid QEC-DD scheme than by QEC
alone, and that invoking DD can reduce the overhead cost of QEC. While early studies identified various
advantages [25-27], they did not address fault tolerance. A substantial step forward was taken in
Ref. [28], which analyzed "DD-protected gates" (DDPGs) in the FTQC setting. Such gates are obtained by
preceding every physical gate (i.e., a gate acting directly on the physical qubits) in a fault-tolerant
quantum circuit by a DD sequence. DDPGs can be less noisy than the bare, unprotected gates, since DD
sequences can substantially reduce the strength of the effective system-environment interaction just at
the moment before the physical gate is applied. The gains can be very substantial if the intrinsic
noise per gate is sufficiently small, and can make quantum computing scalable with DDPGs where it was
not with unprotected gates [28].

The analysis in Ref. [28] assumed a "local" perspective. Rather than analyzing the complete FT quantum
circuit, each single- or multi-qubit gate was separately DD-protected. This required a strong locality
constraint limiting the spatial correlations in the noise, known as the "local bath" assumption.
Unfortunately, many physically relevant error models violate this assumption [13-15].

Here we aim to integrate DD with FTQC using a global perspective. This appears to be necessary in order
to achieve high-order decoupling in a multi-qubit setting under general noise models. Rather than
protecting individual gates, we shall show how an entire FT quantum register, including data and ancilla
qubits, can be enhanced using DD. This will allow us to relax the restrictive local bath assumption.
Along the way, we identify a DD strategy that takes into account the basic structure and building blocks
of FT quantum circuits, and identify optimal DD pulse sequences compatible with this structure that
drastically reduce the number of pulses required compared with previous designs. Such a reduction is
crucial in order to reap the benefits of DD protection, for if a DD sequence becomes too long, noise can
accumulate to such an extent as to outweigh any DD enhancements.

The noise model. — We assume a completely general noise Hamiltonian $H$ acting on the joint system-bath
Hilbert space, the only assumption being that $\|H\| < \infty$, where $\|\cdot\|$ denotes the
sup-operator norm (the largest singular value, or largest eigenvalue for positive operators) [29].
Informally, $H$ contains a "good" and a "bad" part, the latter being the one we wish to decouple. $H$ is
$k$-local, i.e., involves up to $k$-body interactions, with $k > 1$. We allow for arbitrary interactions
between the system and the bath, as well as between different parts of the system or between different
parts of the bath. See Fig. 1.

Dynamical decoupling. — DD pulse sequences comprise a series of rapid unitary rotations of the system
qubits about different axes, separated by certain pulse intervals, and generated by a control
Hamiltonian $H_c(t)$. They are designed to suppress decoherence arising from the "bad" terms in $H$.
This is typically manifested in the suppression or even vanishing of the first $N$ orders, in powers of
the total evolution time $T$, of the time-dependent perturbation expansion (Dyson or Magnus series [30])
of the evolution operator $U(T) = \mathcal{T} \exp(-i \int_0^T \tilde{H}(t)\,dt)$, where $\tilde{H}(t)$
is $H$ in the "toggling frame" (the interaction picture generated by the DD pulse sequence Hamiltonian
$H_c(t)$) [8], and $\mathcal{T}$ denotes time-ordering. When the first non-identity system term of $U(T)$
appears at $O(T^{N+1})$ one speaks of $N$th order decoupling. Such DD sequences are now known and well
understood [31].

FIG. 1. Qubits and corresponding baths represented as white and black circles respectively. Bath
operators corresponding to different operators inside a box do not necessarily commute, while they do if
the baths are in different boxes. The Hamiltonians considered are general within each box, but not
between them. In (a) a diagram of the "local bath assumption" used in Ref. [28] is shown, while (b)
represents the general scenario considered in fault-tolerance [13-15]. In (c) we illustrate one of our
key results: domains are allowed to grow logarithmically in the size of the problem the FTQC is solving.
The dark grey boxes represent such domains, each containing $O[\log(k_{\rm tot})]$ physical qubits at the
highest level of concatenation, where $k_{\rm tot}$ is the total number of logical qubits. When two
domains need to interact (light grey box), the joint DD generator set is used and the locality of the
bath is updated accordingly.

Most DD sequences can be defined in terms of pulses chosen from a mutually orthogonal operator set
(MOOS), i.e., a set of unitary and Hermitian operators $\Omega = \{\Omega_i\}_{i=1}^{|\Omega|}$,
$\Omega_i^2 = 1$ (identity) $\forall i$, such that any pair of operators either commute or anticommute
[?]. The generator set of a MOOS (gMOOS), $\bar\Omega = \{\bar\Omega_i\}_{i=1}^{|\bar\Omega|}$, is defined
as the minimal subset $\bar\Omega \subset \Omega$ such that every element of $\Omega$ is a product of
elements of $\bar\Omega$ but no element in $\bar\Omega$ is itself a product of other elements in
$\bar\Omega$ [32]. All deterministic DD sequences are finitely generated, meaning that the pulses are
elements, or products of elements, of a finite DD generator set (DDGS), which we identify with the gMOOS
$\bar\Omega$.
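
The defining MOOS properties can be checked numerically for small examples. The following is a minimal sketch, not part of the construction above: the helper name `is_moos` and the choice of a two-qubit example are assumptions made for illustration. It tests that each operator is Hermitian and squares to the identity (hence is unitary), and that every pair either commutes or anticommutes.

```python
import numpy as np
from itertools import combinations

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
HAD = (X + Z) / np.sqrt(2)  # Hadamard: Hermitian and unitary, but not a Pauli operator


def is_moos(ops, tol=1e-12):
    """Return True if `ops` satisfies the MOOS conditions (hypothetical helper)."""
    for A in ops:
        ident = np.eye(A.shape[0])
        # Hermitian and A^2 = 1 together imply that A is unitary.
        if not np.allclose(A, A.conj().T, atol=tol):
            return False
        if not np.allclose(A @ A, ident, atol=tol):
            return False
    for A, B in combinations(ops, 2):
        AB, BA = A @ B, B @ A
        if not (np.allclose(AB, BA, atol=tol) or np.allclose(AB, -BA, atol=tol)):
            return False
    return True


# Single-qubit Pauli generators on two qubits: X and Z on each qubit.
pauli_generators = [np.kron(X, I2), np.kron(Z, I2), np.kron(I2, X), np.kron(I2, Z)]

print(is_moos(pauli_generators))  # True
print(is_moos([HAD, X]))          # False: HAD and X neither commute nor anticommute
```

The second call shows why restricting to Pauli operators is convenient: any two Pauli strings automatically commute or anticommute, whereas generic Hermitian unitaries need not.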

The centralizer of the MOOS $\Omega$ is $C_\Omega := \{A \mid [A, \Omega] = 0\}$, i.e., the set of
operators which commute with all MOOS elements. A good example of a gMOOS is the generator set
$\bar{P}_n = \{X^{(i)}, Z^{(i)}\}_{i=1}^{n}$, where $X^{(i)}$ ($Z^{(i)}$) denotes the Pauli-x (z) matrix
acting on the $i$th qubit, of the Pauli group $P_n = P_1^{\otimes n}$ on $n$ qubits (the group of all
$n$-fold tensor products of the standard Pauli matrices $P_1 = \{1, X, Y, Z\}$, modulo $Z_2$). For
simplicity, since we will be dealing with qubits and are particularly interested in decoupling sequences
that allow for bitwise pulses, we shall assume henceforth that $\Omega \subset P_n$. It is necessary to
recast the notion of decoupling order in the MOOS scenario, since the previously mentioned notion turns
out to be too strong for our purposes.

Note that any operator $A$ can be decomposed as $A = A_\Omega + A_r$, where $A_\Omega$ ($A_r$) denotes
the component that commutes (does not commute) with all elements of a MOOS, i.e., $A_\Omega \in C_\Omega$.
We shall say that a pulse sequence with generator set $\bar\Omega$ lasting total time $T$ achieves
"$N$th order $\Omega$-decoupling" if the joint system-bath unitary evolution operator at the conclusion
of the sequence becomes
The subspace invariant under $C_\Omega$ has therefore been decoupled, in the sense that terms not
commuting with $C_\Omega$ appear only in $O(T^{N+1})$. Thus the choice of the pulse generator set
$\bar\Omega$ determines what subspace(s) can be decoupled, and conversely a subspace one is interested in
decoupling to arbitrary order implies a choice of $\bar\Omega$.
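
The decomposition $A = A_\Omega + A_r$ can be illustrated numerically. The sketch below assumes (this is an illustration, not a statement of the text's construction) that the commuting component is extracted by averaging $g A g^\dagger$ over all products of the pulse generators, which is the standard first-order group-averaging picture of DD. The helper names `subset_products` and `commuting_part` are hypothetical.

```python
import numpy as np
from itertools import product

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def subset_products(gens):
    """All products of subsets of the generators (the group, up to phases)."""
    dim = gens[0].shape[0]
    elems = []
    for mask in product([0, 1], repeat=len(gens)):
        g = np.eye(dim, dtype=complex)
        for bit, gen in zip(mask, gens):
            if bit:
                g = g @ gen
        elems.append(g)
    return elems


def commuting_part(A, gens):
    """Average g A g^dagger over the group; phases of g cancel in the conjugation."""
    group = subset_products(gens)
    return sum(g @ A @ g.conj().T for g in group) / len(group)


A = 0.3 * X + 0.7 * Z

# Generator set {Z}: the Z term commutes and survives, the X term is averaged away.
A_omega = commuting_part(A, [Z])
A_r = A - A_omega

# Generator set {X, Z}: nothing traceless survives on a single qubit.
A_full = commuting_part(A, [X, Z])
```

With the single generator $Z$ the surviving part is $0.7\,Z$ and the removed part is $0.3\,X$; with both $X$ and $Z$ the surviving part vanishes, consistent with a trivial centralizer.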

Optimization of the DDGS. — We define the cost of a DD sequence as the total number of pulse intervals
it uses to achieve $N$th order $\Omega$-decoupling. For all known DD sequences (even those optimized for
multiple qubits [33]), the cost is at least

$$ c_{\bar\Omega} = f(N)^{|\bar\Omega|} , \qquad (3) $$

where $|\bar\Omega|$ is the number of generators and $f(N)$ depends on the particular DD sequence. Pulse
interval optimization has already reduced $f(N)$ from $2^N$ for CDD to $N + 1$ for NUDD [?]. Here we are
concerned instead with the optimization of the cost exponent $|\bar\Omega|$, to which end the following
theorem will prove to be crucial (for the proof see [?]):



Theorem 1 Let $B$ be a subgroup of the Pauli group $P_n$, generated by $\bar{B}$. Consider a DDGS
$\bar\Omega \subset P_n$ which decouples $B$ in the sense that the only element in the intersection
between $C_\Omega$ and $B$ is 1. Then $|\bar\Omega| \geq |\bar{B}|$. Moreover, the DDGS
$\bar\Omega = \bar{B}$ decouples $B$ in the desired sense, and automatically saturates the bound.

As an immediate application, we reproduce the well-known result that $\bar\Omega = \bar{P}_n$, and hence
$|\bar\Omega| = 2n$, is optimal for $n$ qubits without encoding [8]. Indeed, in this case the most
general noise Hamiltonian is spanned by the elements of the "error group" $B = P_n$, so $|\bar{B}| = 2n$
and thus by Theorem 1 for any DDGS $\bar\Omega$ it must be that $|\bar\Omega| \geq 2n$. On the other hand
$\bar\Omega = \bar{P}_n$ indeed decouples $P_n$ since $C_\Omega = 1$. Note also that Eq. (2c) yields
$H_\Omega^{S,N}(T) \propto 1$. Moreover, since DD sequences are known that achieve $N$th order
$\Omega$-decoupling for $n > 1$ qubits (specifically CDD [?] and NUDD [?], with explicit
$\bar{P}_n$-based constructions given in Ref. [?]), the generating set $\bar{P}_n$ is the smallest one
capable of achieving $N$th order decoupling of a general $n$-qubit Hamiltonian. However, as we discuss
next, there is a better choice for the purpose of protecting a code subspace.
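
The unencoded case can be verified by brute force for two qubits. This is a minimal sketch under stated assumptions: Pauli operators are written as strings over `I`, `X`, `Y`, `Z` with phases ignored, and the helper names `commute` and `centralizer` are introduced here for illustration only.

```python
from itertools import product


def commute(p, q):
    """Two Pauli strings commute iff they differ on an even number of
    positions where both are non-identity."""
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def centralizer(gens, n):
    """All n-qubit Pauli strings (phases ignored) commuting with every generator."""
    paulis = ("".join(p) for p in product("IXYZ", repeat=n))
    return [p for p in paulis if all(commute(p, g) for g in gens)]


full = ["XI", "ZI", "IX", "IZ"]  # X and Z on each of two qubits, 2n = 4 generators

print(centralizer(full, 2))       # ['II']
print(centralizer(full[:-1], 2))  # ['II', 'IX']
```

With all $2n = 4$ generators only the identity commutes with every pulse. Dropping one generator leaves a non-identity Pauli (`IX`) in the centralizer, i.e., an error term that the pulses cannot remove, in line with the bound of Theorem 1.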

DD generator set for a QEC code. — Consider a set of $n$ physical qubits encoding $k$ logical and $r$
gauge qubits via some distance $d$ code, i.e., an $[[n, k, r, d]]$ subsystem code [36-38] (or an
$[[n, k, d]]$ stabilizer code for $r = 0$), subject to the general noise model described above. Let
$S = \{S_\mu\}_{\mu=1}^{Q}$ denote the stabilizer generators, where $Q = n - (k + r)$, let
$L = \{X_\nu^{L}, Z_\nu^{L}\}_{\nu=1}^{k}$ denote the logical-operator generators of the code, and
$G = \{X_\nu^{G}, Z_\nu^{G}\}_{\nu=1}^{r}$ the gauge operator generators. In the $[[n, k, d]]$ code case,
each error correctable by the code maps a codeword to a syndrome subspace labeled by an error syndrome,
i.e., a sequence of $\pm 1$ eigenvalues of the stabilizer generators [4]. In order to properly integrate
DD with QEC, we require a set of DD generators $\bar\Omega$ which preserves the error syndromes to order
$N$, i.e., such that $H_\Omega^{S,N}$ acts trivially on each of the syndrome subspaces and does not mix
them, so that at the conclusion of the sequence the original noise model for which the code was chosen
is preserved (again, to order $N$). This form of the $N$th order $\Omega$-decoupling requirement will
enable error correction to function as intended. A key observation is that in light of this, we do not
need to protect the complete $2^n$-dimensional Hilbert space $\mathcal{H}$, but rather the $2^{n-k}$
syndrome subspaces.

To this end we propose to choose a complete set of stabilizer and logical operator generators as DD
sequence generators, i.e., let $\bar\Omega = S \cup L$ [39]. We refer to any DD sequence having a DDGS
of this type as an "SLDD" sequence. With this choice, $C_\Omega$ is the stabilizer group generated by
$S$, so that $C_\Omega \subset \Omega$. Now note that if $C_\Omega \subset \Omega$, then the elements in
$C_\Omega$ commute and they define $|C_\Omega| = 2^{|\bar{C}_\Omega|}$ subspaces characterized by their
eigenvalues under the action of $C_\Omega$. In this case we have independent $N$th order
$\Omega$-decoupling of each of these subspaces. In other words, $H_\Omega^{S,N}$ leaves each of the
syndrome subspaces invariant and does not mix them, as desired. Note that the choice
$\bar\Omega = S \cup L$ also applies to subsystem codes [36-38]. In this case each of the syndrome
subspaces can be decomposed as $\mathcal{H}_{\rm logical} \otimes \mathcal{H}_{\rm gauge}$, where
$\mathcal{H}_{\rm logical}$ is invariant under $S \cup G = \bar{C}_\Omega$, since the gauge operators act
non-trivially on $\mathcal{H}_{\rm gauge}$ only. Before proving its optimality, we next compare the cost
of the SLDD sequence to that of decoupling the entire Hilbert space.
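
As a concrete check of the claim that the centralizer of an SLDD generator set is the stabilizer group, the sketch below uses the five-qubit $[[5, 1, 3]]$ code as an example. The choice of code, the string representation of Pauli operators (phases ignored), and all helper names are assumptions made for illustration.

```python
from itertools import product

BITS = {"I": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}
LETTER = {v: k for k, v in BITS.items()}


def commute(p, q):
    """Pauli strings commute iff they clash on an even number of positions."""
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def times(p, q):
    """Product of two Pauli strings, ignoring the overall phase."""
    return "".join(
        LETTER[(BITS[a][0] ^ BITS[b][0], BITS[a][1] ^ BITS[b][1])]
        for a, b in zip(p, q)
    )


def generated_group(gens):
    """Group generated by the given Pauli strings, modulo phases."""
    group = {"I" * len(gens[0])}
    for g in gens:
        group |= {times(g, h) for h in group}
    return group


def centralizer(gens, n):
    paulis = ("".join(p) for p in product("IXYZ", repeat=n))
    return {p for p in paulis if all(commute(p, g) for g in gens)}


# Standard generators of the five-qubit code and its logical X and Z.
stabilizers = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]
logicals = ["XXXXX", "ZZZZZ"]
sldd = stabilizers + logicals  # n + k = 6 generators instead of 2n = 10

cent = centralizer(sldd, 5)
print(len(sldd), len(cent))                  # 6 16
print(cent == generated_group(stabilizers))  # True
```

The 16 Pauli strings that commute with every SLDD pulse are exactly the elements of the stabilizer group, so every remaining Pauli error anticommutes with at least one pulse.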

Relative cost of SLDD. — For an $[[n, k, d]]$ code and an SLDD sequence, the number of stabilizer
generators ($n - k$) plus logical operator generators ($2k$) yields $|\bar\Omega| = n + k$, which means
that $c_{S \cup L} = f(N)^{n+k} < c_{\bar{P}_n} = f(N)^{2n}$. Often $n \gg k$, so that
$c_{S \cup L} \sim \sqrt{c_{\bar{P}_n}}$. In the case of $[[n, k, r, d]]$ subsystem codes [38] the
advantage is more pronounced: the number of stabilizers is $n - k - r$, so $|\bar\Omega| = n + k - r$.
As an example consider the Bacon-Shor $[[m \times m, 1, (m - 1)^2, 3]]$ subsystem code [36], which has
the highest (analytically) known fault-tolerant threshold for error correction routines with [13] and
without measurements [40]. In this case one would have
$c_{S \cup L} = f(N)^{2m} = (c_{\bar{P}_{m^2}})^{1/m}$, a polynomial advantage that grows with the block
size $m$.
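
The counting above is simple enough to tabulate. The following sketch evaluates $|\bar\Omega| = n + k - r$ and the cost bound $f(N)^{|\bar\Omega|}$ for Bacon-Shor blocks; the function names and the sample value $f(N) = 3$ (i.e., $N + 1$ with $N = 2$) are assumptions chosen for illustration.

```python
def sldd_generator_count(n, k, r=0):
    """Number of SLDD generators, n + k - r (r = 0 for a stabilizer code)."""
    return n + k - r


def sequence_cost(f_N, num_generators):
    """Lower bound on the number of pulse intervals, f(N) ** |generator set|."""
    return f_N ** num_generators


def bacon_shor_parameters(m):
    """(n, k, r) for the m x m Bacon-Shor code quoted in the text."""
    return m * m, 1, (m - 1) ** 2


f_N = 3  # illustrative value: N + 1 with N = 2
for m in (2, 3, 4):
    n, k, r = bacon_shor_parameters(m)
    size = sldd_generator_count(n, k, r)  # equals 2m
    sldd_cost = sequence_cost(f_N, size)
    full_cost = sequence_cost(f_N, 2 * n)  # decoupling the whole Hilbert space
    print(m, size, sldd_cost, full_cost)
```

For $m = 3$ this gives 6 generators and a cost of $3^6 = 729$ intervals, against $3^{18}$ for the full Pauli generator set, matching the relation $c_{S \cup L} = (c_{\bar{P}_{m^2}})^{1/m}$.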

Choice of DDGS for protecting ancilla states. — The protection of certain ancilla states is also an
important part of fault tolerance. Such states can be thought of as QEC codes with small stabilizer
sets. E.g., $|{\rm cat}_m^{+}\rangle = (|0\rangle^{\otimes m} + |1\rangle^{\otimes m})/\sqrt{2}$ is often
used for fault-tolerant stabilizer measurements or for teleportation of encoded information. The
stabilizer is generated by $\{X^{\otimes m}, \{Z_i Z_{i+1}\}_{i=1}^{m-1}\}$, and equals the DDGS.
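
A quick numerical check confirms that these operators stabilize the cat state. This is a minimal sketch for small $m$ using dense matrices; the helper names `kron_all`, `cat_state`, and `cat_generators` are hypothetical.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])


def kron_all(ops):
    """Tensor product of a list of single-qubit operators."""
    return reduce(np.kron, ops)


def cat_state(m):
    """(|0...0> + |1...1>) / sqrt(2) on m qubits."""
    psi = np.zeros(2 ** m)
    psi[0] = psi[-1] = 1 / np.sqrt(2)
    return psi


def cat_generators(m):
    """X on all qubits, plus Z_i Z_{i+1} on neighbouring pairs."""
    gens = [kron_all([X] * m)]
    for i in range(m - 1):
        gens.append(kron_all([Z if j in (i, i + 1) else I2 for j in range(m)]))
    return gens


m = 4
psi = cat_state(m)
gens = cat_generators(m)
print(len(gens), all(np.allclose(g @ psi, psi) for g in gens))  # 4 True
```

There are $m$ generators for $m$ qubits, so the stabilized space is one-dimensional and the DDGS contains no logical operators.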

Decoupling multiple subspaces or subsystems. — How should one choose an optimal DDGS to decouple
different subspaces simultaneously? Assume that there are distinct and non-overlapping sets of $\{n_i\}$
physical qubits comprising a quantum register, e.g., a complete register comprising $k$ logical qubits,
along with the corresponding ancillas. Assume that they are partitioned into sets of sizes $\{k_i\}$,
such that $k = \sum_i k_i$, and that each set $i$ is encoded in some subsystem (or subspace) code
$[[n_i, k_i, r_i, d_i]]$. For each block of $k_i$ logical qubits we have an SLDD sequence with DDGS
$\bar\Omega_i = S_i \cup L_i$. Let the Hamiltonians of the different sets be $\{H_i\}$, spanned by the
error groups $\{B_i \subset P_{n_i}\}$. Using Theorem 1 it follows that if $\bar\Omega_i$ is optimal for
error group $B_i$ then $\bar\Omega_{\rm tot} = \bigcup_i \bar\Omega_i$ optimally decouples the joint
Hamiltonian spanned by $\bigcup_i B_i$. This form of composing a larger DDGS out of smaller modules
guarantees that each term of a general Hamiltonian acting on the whole register must anticommute with at
least one element in $\bar\Omega_{\rm tot}$, which in turn implies that $\bar\Omega_{\rm tot}$, used to
construct, e.g., a CDD or NUDD sequence, is capable of independent $N$th order
$\Omega_{\rm tot}$-decoupling of each subspace or subsystem.

Optimal DDGS for concatenated QEC codes. — Many FTQC constructions are based on concatenated QEC codes
[?], so what is the optimal DDGS for this case, cost-wise? Suppose an $[[n, k, r, d]]$ code is
concatenated $R$ times. A complete generator set for all the stabilizers of such a code is given by
$\bigcup_{q=1}^{R} S^{(q)}$, where $S^{(q)}$ is the stabilizer generator set of concatenation level $q$.
Let $L^{(R)}$ denote the set of $R$th-concatenation level logical generators.

Theorem 2 The optimal DDGS for decoupling all the syndrome subspaces at concatenation level $R$ is the
SLDD set $\bar\Omega = \bigcup_{q=1}^{R} S^{(q)} \cup L^{(R)}$, where
$|\bar\Omega| = n^R - (k + r)^R + 2k^R$.

Note that there are alternatives to this "top-level" SLDD strategy; e.g., one could concatenate the DDGS
for each block at each level $q$, but this would result in exponentially more DD pulses. Note also that
by setting $R = 1$ Theorem 2 reduces to the optimality of SLDD for subspace or subsystem codes, with
$|\bar\Omega| = n + k - r$ as claimed above. The subspace case is recovered by setting $r = 0$.

Proof. The number of physical qubits after $R$ levels of concatenation of any $[[n, k, r, d]]$ subsystem
stabilizer code is $n(R) = n^R$, and the error group for the entire Hilbert space is the Pauli group
$P_{n(R)}$. We need to protect the $2^{Q(R)}$ syndrome subspaces, where
$Q(R) = |\bigcup_{q=1}^{R} S^{(q)}|$ is the total number of stabilizer generators after the code is
concatenated $R$ times. $Q(R) = n(R) - L(R) - G(R)$, where $L(R) = k^R$ [$G(R)$] is the number of logical
(gauge) qubits at level $R$, and $L(R) + G(R) = (k + r)^R$ [?].
The SLDD sequence generated by $\bar\Omega = \bigcup_{q=1}^{R} S^{(q)} \cup L^{(R)}$ satisfies the
requirement of independent $N$th order $\Omega$-decoupling of the $2^{Q(R)}$ syndrome subspaces, since
the stabilizers (as DD pulses) remove the errors at each level $q$, logical included (recall that a
logical error at level $q - 1$ anticommutes with at least one level $q$ stabilizer generator), but not
the logical errors at the top level, for which we need $L^{(R)}$ as DD pulses. Moreover, for this
sequence $|\bar\Omega| = Q(R) + 2L(R) = n^R - (k + r)^R + 2k^R$ as claimed. Thus what remains is to prove
its optimality.

Any operator in $P_{n(R)}$ which is not a stabilizer or gauge operator acts as an error either within or
between syndrome subspaces. Thus our choice of code dictates which elements of $P_{n(R)}$ act as errors,
and clearly this error set is precisely $B = P_{n(R)}/C_\Omega$, where the centralizer generator is
$\bar{C}_\Omega = \bigcup_{q=1}^{R} S^{(q)} \cup G^{(R)}$. We have $|\bar{C}_\Omega| = Q(R) + 2G(R)$. On
the other hand $|\bar{B}| = 2n(R) - |\bar{C}_\Omega| = Q(R) + 2L(R)$, so that
$|\bar{B}| = |\bar\Omega|$ and $B \cap C_\Omega = 1$, which proves the optimality of $|\bar\Omega|$ by
virtue of Theorem 1. ■
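
The counting identities used in the proof can be checked mechanically. The sketch below evaluates $Q(R)$, $L(R)$, and $G(R)$ for a few code parameters and confirms that the number of error generators equals the number of SLDD generators. The function name and the sample parameter triples are assumptions for illustration.

```python
def concatenated_counts(n, k, r, R):
    """Generator counts for an [[n, k, r, d]] code concatenated R times."""
    n_R = n ** R               # physical qubits, n(R)
    L_R = k ** R               # logical qubits, L(R)
    G_R = (k + r) ** R - L_R   # gauge qubits, from L(R) + G(R) = (k + r)^R
    Q_R = n_R - L_R - G_R      # stabilizer generators, Q(R)
    omega = Q_R + 2 * L_R        # SLDD generators: stabilizers plus top-level logicals
    centralizer = Q_R + 2 * G_R  # stabilizer plus gauge generators
    errors = 2 * n_R - centralizer
    return omega, centralizer, errors


for n, k, r in [(5, 1, 0), (7, 1, 0), (9, 1, 4)]:
    for R in (1, 2, 3):
        omega, centralizer, errors = concatenated_counts(n, k, r, R)
        # Both equalities from the proof hold for every case tried here.
        assert errors == omega == n ** R - (k + r) ** R + 2 * k ** R
        print((n, k, r), R, omega, 2 * n ** R)
```

The last two printed columns compare the SLDD generator count with the $2n^R$ generators needed to decouple the full Hilbert space.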

Optimizing the choice of DDGS for a complete quantum register: beyond the local bath assumption. — We
have now assembled and described all the ingredients for optimally combining DD with FTQC for protection
of a complete quantum register. However, we must ensure that the cost of implementing the DD sequence
does not spoil quantum speedups. To this end we consider once more an $[[n, k, r, d]]$ subsystem code
concatenated $R$ times, used to encode an entire quantum register, and divide the register into $d(R)$
domains (e.g., a code block along with ancillas) of size $k_D(R) = O(k^R)$ logical qubits, such that the
total number of logical qubits in the register is $k_{\rm tot} = d(R) k_D(R)$. We then optimally decouple
the $i$th domain using an SLDD sequence generated by
$\bar\Omega_i = \bigcup_{q=1}^{R} S_i^{(q)} \cup L_i^{(R)}$, $i \in \{1, \dots, d(R)\}$ (where
$S_i^{(q)}$ and $L_i^{(R)}$ act non-trivially only on the qubits in domain $i$), and ask for the maximal
allowed size of each domain such that the DD sequence cost scales polynomially in $k_{\rm tot}$, as this
will ensure that any exponential quantum speedup is retained.

Corollary 1 In a fault-tolerant quantum computation the maximal allowed domain size compatible with a
DDGS having cost $c_{\bar\Omega} = f(N)^{|\bar\Omega|} = {\rm poly}(k_{\rm tot})$ is
$O[\log(k_{\rm tot})]$.

Proof. We assume that the total cost per domain is Eq. (3), as it captures all known DD sequences.
Theorem 2 shows that $|\bar\Omega| = O[n^R - (k + r)^R + 2k^R]$ (the $O$ symbol is used since we allow
for the presence of ancillas in the domain). We may assume that the code has parameters such that
$n \sim r \sim k$, so that $|\bar\Omega| = O(k^R) = O[k_D(R)]$. Now recall that
$R = O[\log\log(k_{\rm tot})]$ in a fault-tolerant simulation of a quantum circuit [43]. Therefore
$c_{\bar\Omega} = f(N)^{|\bar\Omega|} = {\rm poly}(k_{\rm tot})$ requires
$k_D(R) = O[\log({\rm poly}(k_{\rm tot}))] = O[\log(k_{\rm tot})]$. ■
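
The final step of the proof can be written as a one-line bound. This is a sketch under the assumption that the generator count of a domain obeys $|\bar\Omega| \leq a \log k_{\rm tot}$ for some constant $a > 0$, and that $f(N) \geq 1$; the constant $a$ is introduced here for illustration only.

```latex
\begin{align}
c_{\bar\Omega} = f(N)^{|\bar\Omega|}
  \;\leq\; f(N)^{a \log k_{\rm tot}}
  \;=\; e^{a \log f(N) \, \log k_{\rm tot}}
  \;=\; k_{\rm tot}^{\,a \log f(N)} .
\end{align}
```

At fixed decoupling order $N$ the exponent $a \log f(N)$ is a constant, so the cost is polynomial in $k_{\rm tot}$. A domain whose generator count grows faster than logarithmically would instead give a superpolynomial cost under the same bound.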

Corollary 1 means that we can relax the local bath assumption, an assumption tantamount to assuming
constant domain size $k_D \leq 2$ [28]; instead we find that domains are allowed to grow logarithmically
with problem size. When two domains $i$ and $j$ are required to interact, the joint DDGS
$\bar\Omega_i \cup \bar\Omega_j$ should be used [see Fig. 1(c)]. If the result is that at the highest
concatenation level the noise per gate has been reduced (as shown explicitly for the local bath setting
in Ref. [28]), then a reduction in the number of required concatenation levels is enabled, hence
reducing the overall overhead, or the effective noise threshold.

Enhanced fidelity gates via DD. — So far we discussed the problem of protecting stored quantum
information; what about computation? Quantum logic operations can be combined with DD, e.g., using
"decouple while compute" schemes [44, 45], or (concatenated) dynamically corrected gates [(C)DCGs] for
finite-width pulses [46], or dynamically protected gates [28] in the zero-width (ideal) pulse limit. The
optimal SLDD scheme introduced here is directly portable into the latter two schemes, since they use the
same DD building blocks and the associated group structure. It is important to emphasize that SLDD
sequences require only bitwise (i.e., transversal) pulses, and can be generated by one-local
Hamiltonians, thus not altering the assumptions of the CDCG construction. More importantly, the
polynomial scaling guaranteed by Corollary 1 also applies in the quantum logic scenario, thus allowing,
in principle, a fidelity improvement without sacrificing the speedup of quantum computing.

Conclusions and outlook. — All known DD sequences scale exponentially with the cardinality of their
generating sets [Eq. (3)]. In this work we identified the optimal generating set in the general context
of protection of encoded information. This allowed us to show how DD and FTQC can be optimally
integrated. In doing so we relaxed the local-bath assumption and showed that it can be replaced with
domains growing logarithmically with problem size. Two important open problems remain: to demonstrate
that DD-enhanced FTQC results in improved resource overheads and lower noise thresholds, and to
identify, or rule out, multi-qubit DD sequences with sub-exponential scaling in the cardinality of their
generating sets.